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Abstract 

Efficient  and  robust  multigrid  solvers  for  anisotropic  problems  typically  use  either 
semi-coarsened  grids  or  implicit  smoothers  -  line  relaxation  in  2D  and  plane  relaxation 
in  3D.  However,  both  of  these  may  be  difficult  to  implement  in  codes  using  multi¬ 
block  structured  grids  where  there  may  be  no  natural  definition  of  a  global  ‘line’  or 
‘plane’.  These  multi-block  structured  grids  are  often  used  in  fluid  dynamic  applications 
to  capture  complex  geometries  and/or  to  facilitate  parallel  processing.  In  this  paper, 
we  investigate  the  performance  of  multigrid  algorithms  using  implicit  smoothers  within 
the  blocks  of  a  such  a  grid.  By  looking  at  a  model  problem,  the  2-D  anisotropic  diffusion 
equation,  we  show  that  true  multigrid  efficiency  is  achieved  only  when  the  block  sizes 
are  proportional  to  the  strength  of  the  anisotropy.  Further,  the  blocks  must  overlap  and 
the  size  of  the  overlap  must  again  be  proportional  to  the  strength  of  the  anisotropy. 


*  This  research  was  supported  in  part  by  the  National  Aeronautics  and  Space  Administration  under 
NASA  Contract  No.  NAST19480  while  the  first  author  was  in  residence  at  the  Institute  for  Computer 
Applications  in  Science  and  Engineering  (ICASE),  NASA  Langley  Research  Center,  Hampton,  VA 
23681-0001 


1.  Introduction  and  Motivation.  To  illustrate  the  main  issues  involved  in  de¬ 
signing  efficient  multigrid  solvers  for  anisotropic  problems,  consider  the  problem  of 
solving  the  partial  differential  equation 


(1) 


along  with  some  suitable  boundary  condition  on  a  rectangular  domain  fl  C  whose 
boundaries  are  parallel  to  the  xy  coordinates.  This  PDE  can  be  discretized  by  placing  a 
uniform  grid  (with  mesh  size  h)  over  the  domain  and  using  second-order  finite  difference 
approximations  for  the  derivatives.  We  associate  to  each  vertex  in  the  grid  an  index 
(/,  J)  in  the  usual  manner  and  let  Uj  j  and  Ff  j  denote  the  discrete  approximations  to 
the  continuous  variables  u  and  /  appearing  in  equation(l).  The  discrete  equations  then 
have  the  form 


(2) 
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Note  that  here  we  have  considered  an  anisotropic  PDE  discretized  on  a  grid  with  equal 
mesh  size  in  the  x  and  y  directions.  The  same  discrete  equations  arise  when  an  isotropic 
PDE  (e  =  1)  is  discretized  on  grid  with  mesh  size  h  in  the  y-direction  and  h/ y/e  in  the 
x-direction. 

Multigrid  methods  rely  on  two  processes:  a  relaxation  process  to  reduce  high- 
frequency  error  and  a  coarse  grid  correction  process  to  reduce  low-frequency  error. 
When  standard  coarsening  is  used  (i.e.  coarse  grids  are  obtained  from  fine  grids  by 
deleting  every  other  line  in  each  coordinate  direction)  error  components  that  are  oscil¬ 
latory  in  either  coordinate  direction  must  be  effectively  reduced  by  relaxation  as  these 
components  cannot  be  represented  on  the  coarser  grid.  Using  the  tool  of  local  mode 
analysis  on  the  above  discrete  equations  with  0  <  e  <C  1,  one  can  show  (see  [1])  that 
point  Gauss-Seidel  relaxation  (or  more  generally,  any  point-wise  relaxation  process) 
does  not  efficiently  reduce  error  components  that  are  oscillatory  in  the  a:-direction  but 
smooth  in  the  y-direction.  One  solution  is  to  accept  this  limitation  of  the  relaxation 
process  and  coarsen  the  grid  only  in  the  y-direction.  The  other  solution  is  line  relaxation 
in  the  y-direction.  Here  all  unknowns  sharing  the  same  I  index  are  updated  simultane¬ 
ously,  and  local  mode  analysis  shows  that  this  relaxation  process  effectively  reduces  all 
high-frequency  error  components.  Either  solution,  semi- coarsening  or  line  relaxation, 
yield  multigrid  algorithms  which  are  efficient  for  small  e. 

In  the  field  of  Computational  Fluid  Dynamics  (CFD),  discrete  formulations  of  the 
equations  representing  the  flow  around  aerodynamic  bodies  are  solved.  These  equations 
are  solved  at  discrete  points  defined  by  a  previously  generated  body-fitted  grid.  The 
body  fitted  grid  can  be  what  is  referred  to  as  a  structured  or  an  unstructured  grid.  For 
unstructured  grids,  the  relative  position  of  the  grid  points  is  explicitly  stored  and  must 
be  referred  to  by  the  flow  solver  when  calculating  the  grid  metric  quantities.  For  struc¬ 
tured  grids,  the  relative  placement  of  the  grid  points  is  implicitly  known  by  the  relative 
position  of  the  data  referring  to  the  points  in  the  program’s  data  arrays.  Unstructured 
grids  offer  greater  flexibility  to  fit  complex  geometries  but  carry  the  overhead  of  needing 
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Fig.  1.  Multi-bock  grid  for  a  multi-element  airfoil. 


to  carry  explicit  information  about  the  relative  position  of  each  point  to  its  neighbors. 
Structured  grids  are  inherently  more  efficient  computationally  but  are  limited  in  their 
ability  to  fit  complex  geometries.  Block-structured  (or  multi-block)  grids  can  be  used 
to  provide  greater  geometric  flexibility  for  structured  grids. 

In  a  block-structured  grid,  both  semi-coarsening  and  line  relaxation  may  be  difficult 
to  implement  as  there  may  be  no  natural  definition  of  a  global  line.  An  example  of  such 
a  grid  is  shown  in  figure(l).  Further,  if  the  blocks  are  stored  on  different  processors 
in  a  parallel  computer,  line-relaxation  will  require  the  solution  of  a  tridiagonal  system 
which  is  distributed  across,  perhaps,  many  processors.  These  potential  difficulties  led 
us  to  investigate  the  use  of  line-relaxation  only  within  the  blocks  of  the  grid.  We  will 
see  in  both  the  numerical  experiments  and  the  analysis,  that  line-relaxation  within  the 
blocks  is  generally  not  enough  to  obtain  efficient  multigrid  solvers.  To  obtain  the  same 
convergence  factors  as  obtained  using  point-wise  Gauss-Seidel  relaxation  of  the  isotropic 
problem,  the  blocks  must  overlap  with  their  neighboring  blocks  and  both  the  size  of  the 
blocks  and  the  overlap  must  be  proportional  to  the  strength  of  the  anisotropy. 

2.  Numerical  Experiments.  To  investigate  the  effect  of  anisotropy  strength, 
block  size  and  overlap  on  multigrid  efficiency,  we  construct  the  following  2D  test  case. 
Using  a  uniform  grid  of  size  (2^'  -|-  1)  x  (2^  -f-  1),  k  an  integer,  and  mesh  size  =  1;  the 
PDF  in  equation(l)  is  discretized  using  finite  differences  as  in  equation(2).  Dirichlet 
boundary  conditions  are  assumed  so  that  the  value  of  the  solution  u  at  extreme  grid 
points,  those  with  i  or  j  index  equal  to  1  or  2*^  + 1,  is  given.  This  leaves  (2^'  —  1)  x  (2*  —  1) 
interior  vertices  where  an  approximate  solution  fW  must  be  calculated.  We  divide  these 
vertices  into  regular  blocks  of,  as  much  as  possible,  equal  size:  M  x  M.  Note  that  if  M 
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Fig.  2.  Block-structured  grid  of  type  used  in  numerical  experiments:  33  x  33  grid  with  M  =  8. 

does  not  divide  2^'  —  1,  some  blocks  will  be  smaller:  either  M  x  (M  —  1),  (M  —  1)  x  M, 
or  [M  —  1)  X  (M  —  1).  For  example,  most  of  the  blocks  in  the  33  x  33  grid  shown  in 
figure  (2)  contain  8  interior  vertices  in  each  direction,  but  those  on  the  top  boundary 
include  only  7  interior  vertices  in  the  vertical  direction.  Likewise,  those  on  the  right 
boundary  include  only  7  interior  vertices  in  the  horizontal  direction. 

This  block  structure  only  applies  in  the  relaxation  process,  which  is  as  follows. 
Beginning  at  the  left  boundary,  each  vertical  line  is  scanned  in  turn.  Within  each 
line  the  blocks  are  scanned  (beginning  at  the  bottom  boundary)  and  unknowns  on  the 
vertical  line  that  are  within  the  block  and  the  first  S  unknowns  in  the  adjacent  blocks  are 
updated  simultaneously  to  satisfy  their  discrete  equations.  The  dark  dots  in  figure  (3) 
indicate  which  unknowns  would  be  updated  simultaneously  when  M  =  8  and  ^  =  3. 
This  relaxation  scheme  can  be  viewed  as  a  block  Gauss-Seidel  scheme  with  overlapping 
blocks.  When  M  =  1,(5  =  0  (i.e.  each  point  is  treated  as  a  block)  we  have  pointwise 
Gauss-Seidel  relaxation,  and  when  M  =  2^'  -f-  1  (i.e.  the  whole  grid  is  treated  as  one 
block)  we  have  y-line  Gauss-Seidel  relaxation. 

We  report  results  for  a  two- level  y(l,  1)  cycle.  The  fine  grid  is  257  x  257,  and  the 
global  coarse  grid  problem  (on  a  129  x  129  grid)  is  solved  exactly.  Bilinear  interpolation 
is  used  to  interpolate  the  correction  to  the  fine  grid  and  full  weighting  is  used  to  restrict 
residuals  to  the  coarse  grid.  In  the  experiments  the  right-hand  side,  /,  is  set  to  zero 
and  homogeneous  Dirichlet  boundary  conditions  are  applied.  Thus  the  solution  to 
the  PDE,  and  the  resulting  discrete  equations,  is  the  trivial  one:  u  =  0.  We  use 
a  random  initial  guess,  perform  many  v-cycles,  and  monitor  the  convergence  factor 
per  cycle  (the  £2  norm  of  the  residual  after  a  cycle  divided  by  the  £2  norm  of  the 
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Fig.  3.  Unknowns  updated  simultaneously:  M  =  S  and  <5  =  3. 

residual  before  a  cycle).  In  Table  (1)  we  report  the  asymptotic  convergence  factors 
for  various  values  of  the  parameters  e,  M,  and  6,  Here  the  asymptotic  convergence 
factor  is  the  worst  case  (largest)  convergence  factor  observed  in  50  cycles.  As  a  point 
of  reference,  the  asymptotic  convergence  factor  for  the  isotropic  problem,  e  =  1,  using 
point  Gauss-Seidel  relaxation,  M  =  1,6  =  0,  is  approximately  0,2.  If  we  want  to 
maintain  this  factor  for  anisotropic  problems  by  using  block-wise  line  relaxation,  it 
is  clear  from  the  table  that  both  the  required  size  of  the  blocks  M  and  the  amount 
of  overlap  6  grow  with  the  strength  of  the  anisotropy.  The  results  in  the  table  (in 
particular,  the  similar  convergence  factors  for  the  three  entries:  e  =  1/40,  M  =  4, 6  =  2; 
e  =  1/160,  M  ^  8,6  =  4;  and  e  =  1/640,  M  =  16,6  =  8)  suggest  the  following: 

Observation  2.1.  To  obtain  convergence  factors  similar  to  those  for  the  isotropic 
problem^  the  minimum  block  size  is  0{l/^/e).  Further^  for  this  minimum,  block  size  the 
required  overlap  is  also 

3,  Analysis.  Standard  local  mode  analysis  of  the  block- wise  line  relaxation  scheme 
used  in  the  above  numerical  experiments  is  complicated  by  the  fact  that  a  relaxation 
sweep  does  not  transform  a  given  Fourier  component  into  a  simple  multiple  of  itself. 
Because  of  this,  in  the  following  analysis  we  will  use  algebraic  arguments  to  see  what 
combination  of  parameters  e,Af,  and  6  can  guarantee  that  block- wise  line  relaxation 
smoothes  the  error  as  effectively  as  full  line  relaxation. 

We  consider  the  same  model  problem  (2D  anisotropic  diffusion  equation  discretized 
by  finite  differences  on  a  uniform  grid  of  size  n  x  n),  but  with  periodic  boundary 
conditions.  Given  some  initial  approximate  solution  U^,  the  result  of  full  line  relaxation 


4 


of  the  line  with  i  index  equal  to  I  can  be  written  as 

(3)  =  L-\hHi  -  eU/_i  -  eU/+i). 


Here 
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Given  the  same  initial  approximate  solution,  the  result  of  relaxing  the  unknowns  with 
index  =  1, N  <  n  a,s  a.  block  can  be  similarly  written  as 

(5)  =  L-\hHi[l  :  N]  -  eU7_i[l  :  N]  -  eU,+i[l  :  N]  - 

Here  we  have  used  the  notation  f/[l  ;  N]  to  denote  the  A^-dimensional  vector  formed 
from  the  first  N  components  of  f/,  and  similarly  U/_i[l  :  TV]  and  U7+i[l  :  TV].  The 
vectors  ei  and  are  the  TV-dimensional  unit  vectors  with  a  one  for  the  first  and  TV-th 
component  respectively,  and  the  TV  x  TV  matrix 
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Letting  :  TV]  denote  the  TV-dimensional  vector  formed  from  the  first  TV  compo¬ 
nents  of  this  vector  satisfies 

(6)  :  TV]  =  L-\h%[l  :  TV]  -  eUz_i[l  :  TV]  -  eUi+,[l  :  TV]  -  -  4+ieAr). 
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Defining  d  to  be  the  difference  between  the  results  of  full  line  relaxation  and  block-wise 
line  relaxation,  i.e.  dj  =  yields  (by  subtracting  equation(5) 

from  equation(6)) 

(7)  d  =  — —  Uj^^)L  ^ei  —  (u^N+i  ~ 

In  the  following,  we  consider  only  the  second  term  above;  the  analysis  for  the  first  term 
is  completely  analogous.  This  second  term  is  the  product  of  two  factors: 
the  change  induced  at  grid  point  {I,N  +  1)  by  line  relaxation,  and  L~^eN.  In  the 
analysis,  we  would  like  to  get  an  upper  bound  on  magnitude  of  d  and  see  how  this 
bound  depends  on  the  parameters  e  and  N.  In  general,  it  is  not  possible  to  bound 
the  scalar  factor  because  changing  the  value  of  the  initial  approximation 

at  the  point  {I,N  +  1)  will  not  effect  the  result  of  line  relaxation.  In  the  following, 
we  make  the  assumption  that  this  scalar  factor  is  =  1.  One  could  carry  this  factor 
throughout  the  analysis,  or  simply  multiply  our  final  bound  on  d  by  this  factor,  but 
doing  so  would  only  obscure  (but  not  invalidate)  the  points  we  are  making.  With 
these  assumptions  the  equation  relating  the  difference  between  full  line  relaxation  and 
block-wise  line  relaxation  is 


(8)  d  =  -L-^bn 

Using  Gaussian  elimination  to  solve  equation(8),  one  obtains  the  following  formula  for 
the  components  of  d: 

(9)  dj  =pj_i(e)/piv(e),j  =  l,...,fV, 

where  Pk{(-)  is  a  polynomial  in  e  of  degree  k.  The  polynomials  are  generated  by  the 
recurrence  relation 

Po(c)  =  1 

(10)  Pi(e)  =  2 +  2e 

Pk{t)  =  (2  +  2e)pfc_i(e) -pA_2(e)  k  =  2,...,N 

Defining  a*,  =  pt(e)/p/._i(e),  we  can  rewrite  the  recurrence  relation  in  the  form 


(11) 


ai  =  2  -f  2e 

—  2 -|- 2e — l/Q:^._l  k  =  2,...,N 


Theorem  3.1.  For  the  sequence  defined  by  the  above  recurrence  relation 

o/i:  >  1  +  e  +  \/2e  -f  e^,  \/k 


Proof:  Define 


g{x)  =  2  -\-2e  —  1/x. 

Then  g  E  C[l  -|-  2e,  2  -f  2e]  and  g{x)  G  [1  2e,  2  -f  2e],V.T  G  [1  -f  2e,  2  -f-  2e].  Further, 

0  <  g'{x)  <  1/(1  -f  2e)^  <  1,  for  a:;  G  (1  -f  2e,  2  -|-  2e). 
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These  conditions  guarantee  convergence  of  the  sequence  to  the  fixed  point  a  = 

1  +  e  -I-  V2e  +  (See,  for  example,  the  fixed  point  theorems  in  Chapter  2  of  [3]).  In 
addition,  by  application  of  the  mean  value  theorem 

a* -a  ^  g{ak--i)  -  g{oL) 

=  5f'(C)(«A:-i  —  a),  C  ^  (1  +  2c,  2  +  2e). 

Because  0  <  g'{x)  <lforxe  (1  +  2e,2  +  2e)  and  ai  >  a,  equation(12)  implies  that  the 
sequence  is  monotonically  decreasing.  The  theorem  follows  from  the  convergence  and 
the  monotonicity. 

From  this  theorem,  we  have  the  following  bound  on  d,  the  difference  between  the  results 
for  line  relaxation  and  block-wise  line  relaxation: 


(13) 


=  uLjPk-i{^)/pk{t) 


<  (1  +  e  +  V2e  +  e2)-i^-J'+i). 


Given  some  tolerance  7  <  1,  we  can  guarantee  that  \dj\  will  be  less  than  7  by  requiring 
that 


(14) 


j<N  + 


log  (7) 

log  (1  -h  c  -1-  v^e+T^) 


This  equation  can  be  viewed  as  a  requirement  on  TV,  it  must  be  at  least  large  enough 
so  that  the  right-hand  side  of  the  inequality  is  positive.  For  a  fixed  e,  if  we  choose 


(15) 


N  =  N,= 


log  (7) 


^  /3  —  1  y  log  ( 1  +  e  +  y/2e  +  ) 


,0  <  ^  <  1, 


then  \dj\  <  7  for  j  =  1,2,. With  this  choice  of  N  the  difference  between  line 
relaxation  and  block-wise  line  relaxation  will  be  less  than  the  tolerance  except  at  points 
near  the  end  of  the  block  -  those  with  index  {I,j),  fiN  <  j  <  N.  If  e  is  now  reduced  by 
the  factor  0  <  //  <  1,  to  meet  the  same  criterion  (|dj|  <  7  for  j  =  1, 2, . . . ,  /3N)  we  need 


(16) 


iV  =  = 


1  \  _ log  (7) 

/9  -  1/  log  (1  +  /<e  -f-  y/2iJ,e  + 


It  follows  from  equations(15)  and  (16)  that 


(17) 


N  =  N  log(l  +  ^  + + 

^log  {1  +  pe+  ^/2pe  -I- 


N,,  =  NJ^. 
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And  in  the  limit  as  e  ^  0, 

(18) 


The  results  of  this  analysis  agree  with  Observation  2.1  in  the  numerical  experiments 
section.  Namely,  equation(14)  shows  that  relaxing  only  a  segment  of  line  can  smooth 
as  effectively  as  full  line  relaxation  at  points  away  from  the  end  points  of  the  segment 
provided  that  the  length  of  the  segment,  is  large  enough.  However,  it  may  not  be  an 
effective  smoother  near  the  endpoints.  Therefore,  to  get  effective  smoothing  within  a 
block  using  block- wise  line  relaxation  we  must  extend  the  line  segments  relaxed  into  the 
neighboring  blocks.  Further,  equation(18)  shows  that  for  small  e  the  length  of  the  line 
segment  must  be  to  guarantee  effective  smoothing  away  from  the  endpoints, 

i.e.  for  (j  <  l3N).  This  means  that  block- wise  line  relaxation  can  be  an  effective 
smoother  provided  that  both  the  block  size  M  and  the  overlap  S  are  0(l/>/e). 

4.  Conclusions.  The  result  of  relating  the  necessary  block  size  and  overlap  in  a 
multi-block  line  relaxation  scheme  to  the  strength  of  the  anisotropy  that  was  illustrated 
by  the  numerical  results  in  Section  2,  appears  to  hold  for  many  schemes.  Some  of  these 
may  be  more  efficient  on  parallel  computers  than  the  scheme  in  Section  2. 

For  example,  damped  vertical  line  Jacobi  relaxation  (with  damping  parameter  tv  = 
0.7)  is  known  to  be  an  effective  smoother  for  our  model  problem  (see  [4]).  Figure  (4) 
illustrates  a  multi-block  version  of  this  relaxation  scheme  (with  block  size  M  =  6  and 
overlap  ^  1).  Given  an  initial  approximation  U,  in  step  1  a  new  approximation  V 

is  calculated  at  the  indicated  points  by  block  relaxation  of  each  of  the  marked  line 
segments  using  values  of  U  at  all  points  not  on  that  line  segment.  In  step  2,  a  new 
approximation  V  is  again  calculated  at  the  indicated  points  by  by  block  relaxation  of 
each  of  the  marked  line  segments  using  values  of  U  at  all  points  not  on  that  line  but 
the  value  of  V  calculated  in  step  1  for  points  directly  above  and  below  the  line  segment. 
All  line  segments  in  step  1  can  be  processed  simultaneously,  likewise  in  step  2.  Then 
the  final  update  is  given  by 

U^(l-cj)U  +  u;V,  a;-0.7. 

Note  that  the  scheme  is  simply  damped  vertical  line  Jacobi  when  the  grid  is  treated 
as  a  single  block.  We  have  experimented  with  this  scheme,  and  have  found  that  the 
resulting  multigrid  algorithm  is  efficient  (convergence  factors  are  as  good  or  better  than 
those  for  damped  Jacobi  on  the  isotropic  problem)  provided  that  the  block  size  and 
overlap  are  0(1 />/e). 

For  simplicity,  our  analysis  and  test  problems  have  focused  on  rectangular  grids. 
However,  we  would  expect  the  same  block  size  and  overlap  requirements  would  need  to 
be  met  for  an  efficient  multigrid  solver  on  a  more  general  block-structured  grid  -  such  as 
the  one  in  figure(l).  A  practical  algorithm  would  likely  involve  checking  to  see  if  there  is 
an  anisotropy  that  crosses  the  block  interface  and,  if  so,  constiuicting  a  sensible  extension 
of  the  line  from  one  block  to  its  neighbor.  This  is  an  easier  process  to  implement  than 
constructing  a  “global”  line  for  such  a  grid.  We  are  currently  investigating  this  idea 
(and  3D  generalizations)  with  the  aim  of  developing  an  efficient  relaxation  process  to 
be  used  in  multigrid  solvers  for  general,  three  dimensional  block-structured  grids. 
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Fig.  4.  Mulii-block  relaxation  based  on  damped  line  Jacobi 
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